feat(init): add init command for guided Sentry project setup by betegon · Pull Request #283 · getsentry/cli

betegon · 2026-02-23T21:14:22Z

Summary

Adds sentry init — an AI-powered wizard that walks users through adding Sentry to their project. It detects the platform, installs the SDK, instruments the code, and configures error monitoring, tracing, and session replay.

Changes

New init command backed by a Mastra AI workflow (hosted at getsentry/cli-init-api) that handles platform detection, SDK installation, and code instrumentation
ASCII banner, AI transparency note, and review reminder in the wizard UX
Tracing: unique trace IDs per wizard run with flattened span hierarchy
Python platforms use venv for isolated dependency installation
Command execution guardrails: shell metacharacter blocking, dangerous executable blocklist, path-traversal prevention
Magic values extracted into named constants (constants.ts)
Docs page added to cli.sentry.dev
Eval test suite (test/init-eval/) — see below

Eval suite

The eval suite validates that the wizard produces correct, buildable Sentry instrumentation for each supported platform. It uses a 3-phase test architecture:

Phase 1: Wizard run

Each test scaffolds a fresh project from a platform template, then runs the full sentry init wizard against it. The wizard output (exit code, stdout/stderr, git diff, new files) is captured for the next phases.

Phase 2: Hard assertions (deterministic)

Five code-based pass/fail checks that run without any LLM:

exit-code — wizard exits 0
sdk-installed — the Sentry SDK package appears in the dependency file (package.json / requirements.txt)
init-present — Sentry.init (or sentry_sdk.init) appears in changed or new files
no-placeholder-dsn — no leftover placeholder DSNs (___PUBLIC_DSN___, YOUR_DSN_HERE, etc.)
build-succeeds — npm run build / equivalent passes after the wizard's changes
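As a rough illustration of how two of the checks above (init-present and no-placeholder-dsn) could be expressed, the sketch below scans file contents with plain string and regex matching. The `ChangedFile` shape and the function names are assumptions made for this example, and the placeholder list contains only the two values named above; none of this is taken from the suite's actual code.

```typescript
// Hypothetical shape for a file touched or created by the wizard.
interface ChangedFile {
  path: string;
  content: string;
}

// Placeholders named in the PR description; the real list may be longer.
const PLACEHOLDER_DSNS = ["___PUBLIC_DSN___", "YOUR_DSN_HERE"];

// Matches Sentry.init( for JavaScript SDKs and sentry_sdk.init( for Python.
const INIT_CALL = /\b(?:Sentry\.init|sentry_sdk\.init)\s*\(/;

// init-present: at least one changed or new file contains an init call.
function initPresent(files: ChangedFile[]): boolean {
  return files.some((file) => INIT_CALL.test(file.content));
}

// no-placeholder-dsn: returns the paths of files that still contain a placeholder.
function filesWithPlaceholderDsn(files: ChangedFile[]): string[] {
  return files
    .filter((file) => PLACEHOLDER_DSNS.some((p) => file.content.includes(p)))
    .map((file) => file.path);
}
```

Returning the offending paths rather than a boolean keeps the failure message actionable when a placeholder slips through.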

Phase 3: LLM judge (per-feature)

For each feature (errors, tracing, replay, logs, profiling, etc.), an LLM judge scores correctness:

Official Sentry docs are fetched as ground truth (URLs mapped in feature-docs.json)
GPT-4o evaluates the wizard's diff + new files against the docs on 4 criteria: feature-initialized, correct-imports, no-syntax-errors, follows-docs
Each criterion is scored pass/fail/unknown; the overall feature score must be >= 0.5
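The threshold rule above can be sketched as follows. The PR description does not say how pass/fail/unknown verdicts are combined into a score, so this example assumes the score is the fraction of the four criteria that passed, with "unknown" not counting as a pass. The type and function names are hypothetical.

```typescript
type Verdict = "pass" | "fail" | "unknown";

// The four criteria named in the PR description.
type Criterion =
  | "feature-initialized"
  | "correct-imports"
  | "no-syntax-errors"
  | "follows-docs";

// Threshold stated in the PR description.
const PASS_THRESHOLD = 0.5;

// ASSUMPTION: score = passed criteria / all criteria, so "unknown" does not
// contribute to the score. The real judge may weight verdicts differently.
function featureScore(verdicts: Record<Criterion, Verdict>): number {
  const values = Object.values(verdicts);
  const passed = values.filter((v) => v === "pass").length;
  return passed / values.length;
}

function featurePasses(verdicts: Record<Criterion, Verdict>): boolean {
  return featureScore(verdicts) >= PASS_THRESHOLD;
}
```

Under this assumption, a feature with two passing criteria out of four sits exactly on the threshold and is accepted.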

Platforms

6 platform templates are covered:

| Platform | Template | SDK |
| --- | --- | --- |
| Express | `express/` | `@sentry/node` |
| Next.js | `nextjs/` | `@sentry/nextjs` |
| SvelteKit | `sveltekit/` | `@sentry/sveltekit` |
| React + Vite | `react-vite/` | `@sentry/react` |
| Flask | `python-flask/` | `sentry-sdk` |
| FastAPI | `python-fastapi/` | `sentry-sdk` |

Running

bun run test:init-eval          # all platforms

Requires SENTRY_AUTH_TOKEN, SENTRY_ORG, SENTRY_PROJECT, and optionally OPENAI_API_KEY (LLM judge is skipped without it).

Test Plan

bun run test:init-eval passes for all 6 platforms
bun run lint and bun run typecheck pass
CI passes (unit tests, e2e, lint, typecheck, build)
CI workflow for eval is tracked separately in Run init evals on CI #290

🤖 Generated with Claude Code

Adds `sentry init` wizard that walks users through project setup via the Mastra API, handling DSN configuration, SDK installation prompts, and local file operations. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Sends tags and metadata (CLI version, OS, arch, node version) with startAsync and resumeAsync calls so workflow runs are visible and filterable in Mastra Studio. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Import randomBytes and generate a hex trace ID so all suspend/resume calls within a single wizard run share one trace. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
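A minimal sketch of the idea in this commit, assuming a 16-byte ID (the commit message does not state the length) and a hypothetical `traceId` field inside `tracingOptions`: the ID is generated once per wizard run and reused for every start and resume call.

```typescript
import { randomBytes } from "node:crypto";

// 16 random bytes encode to 32 hex characters.
// ASSUMPTION: the byte length is illustrative; the PR does not specify it.
function generateTraceId(): string {
  return randomBytes(16).toString("hex");
}

// Generate once per wizard run, then pass the same value to every call so
// the suspend/resume round trips land in a single trace.
const traceId = generateTraceId();
const startOptions = { tracingOptions: { traceId } };
const resumeOptions = { tracingOptions: { traceId } };
```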

Add a synthetic parentSpanId to tracingOptions so all workflow run spans become siblings under the same parent instead of nesting by timestamp containment. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The parentSpanId was creating artificial nesting - let the workflow engine handle span hierarchy naturally. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Display the branded SENTRY ASCII banner before the intro line for visual consistency with `sentry --help`. Make the "errors" feature always enabled in the feature multi-select so users cannot deselect error monitoring. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…pt, and source maps hint

Route success-with-exitCode results to formatError so the --force hint is shown when Sentry is already installed. Fold the "Error Monitoring is always included" note into the multiselect prompt. Use a more approachable Source Maps hint. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Show a non-blocking info note about AI usage with a docs link before the first network call, and a review reminder before the success outro. Extract SENTRY_DOCS_URL constant to share between wizard-runner and clack-utils cancel message. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add @anthropic-ai/sdk and openai as devDependencies for the LLM-as-judge eval framework. Add opencode-lore dependency. Exclude test/init-eval/templates from biome linting since they are fixture apps, not source code. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add LLM-as-judge eval tests for the init wizard across all five platforms (Express, Next.js, Flask, React+Vite, SvelteKit). Each test runs the wizard end-to-end and asserts on SDK installation, Sentry.init presence, build success, and documentation accuracy via an LLM judge. Includes template apps, helper utilities (assertions, doc-fetcher, judge, platform configs), and feature-docs.json mapping. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add a separate workflow for running init-eval tests on demand. Supports running a single platform or all platforms via matrix. Uses the init-eval GitHub environment for MASTRA_API_URL and OPENAI_API_KEY secrets. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Store python-fastapi doc URLs as base paths (with trailing slash) like other platforms, and convert to .md at fetch time. This mirrors the pattern in cli-init-api and lets us return clean markdown directly instead of stripping HTML tags. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
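The conversion described in this commit could look roughly like the sketch below. It assumes the markdown variant of a docs page is reached by dropping the trailing slash and appending `.md`; the function name and that exact mapping are assumptions, not taken from the eval helpers.

```typescript
// ASSUMPTION: ".../fastapi/" maps to ".../fastapi.md". The exact rule used by
// the doc fetcher is not shown in the PR.
function toMarkdownUrl(baseUrl: string): string {
  const trimmed = baseUrl.replace(/\/+$/, ""); // drop any trailing slashes
  return `${trimmed}.md`;
}
```

Keeping base paths in feature-docs.json and deriving the `.md` form at fetch time means the stored URLs stay valid links to the human-readable pages.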

Add Sentry doc URLs for python-flask (getting-started, errors, tracing, logs, profiling) and add the shared python/profiling page to both flask and fastapi profiling entries. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add Sentry doc URLs for all nextjs features: getting-started, errors, logs, tracing, session replay, metrics, and profiling (browser + node). Sourcemaps left empty for now. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add Sentry doc URLs for sveltekit features and add missing logs, metrics, and profiling features to the platform entry. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add Sentry doc URLs for react-vite features and add missing logs, metrics, and profiling features to the platform entry. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Flask eval was using bare `pip install` which fails when pip isn't on PATH. Use the same venv pattern as fastapi. Also remove accidental opencode-lore runtime dependency. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions · 2026-02-23T21:15:24Z

Semver Impact of This PR

🟡 Minor (new features)

📋 Changelog Preview

This is how your changes will appear in the changelog.
Entries from this PR are highlighted with a left border (blockquote style).

New Features ✨

(formatters) Render all terminal output as markdown by BYK in #297

(init) Add init command for guided Sentry project setup by betegon in #283

(issue-list) Global limit with fair distribution, compound cursor, and richer progress by BYK in #306

Bug Fixes 🐛

Api

Use numeric project ID to avoid "not actively selected" error by betegon in #312
Use limit param for issues endpoint page size by BYK in #309
Auto-correct ':' to '=' in --field values with a warning by BYK in #302

Formatters

Expand streaming table to fill terminal width by betegon in #314
Fix HTML entities and escaped underscores in table output by betegon in #313

Other

(ci) Generate JUnit XML to silence codecov-action warnings by BYK in #300
(nightly) Push to GHCR from artifacts dir so layer titles are bare filenames by BYK in #301
(test) Handle 0/-0 in getComparator anti-symmetry property test by BYK in #308

Internal Changes 🔧

(api) Wire listIssuesPaginated through @sentry/api SDK for type safety by BYK in #310

🤖 This preview updates automatically when you update the PR.

.github/workflows/init-eval.yml

github-actions · 2026-02-23T21:15:47Z

Codecov Results 📊

✅ 45 passed | Total: 45 | Pass Rate: 100% | Execution Time: 0ms

📊 Comparison with Base Branch

| Metric | Change |
| --- | --- |
| Total Tests | 📉 -2246 |
| Passed Tests | 📉 -2246 |
| Failed Tests | — |
| Skipped Tests | — |

All tests are passing successfully.

✅ Patch coverage is 94.97%. Project has 3854 uncovered lines.
❌ Project coverage is 78.89%. Comparing base (base) to head (head).

Files with missing lines (4)

| File | Patch % | Lines |
| --- | --- | --- |
| `wizard-runner.ts` | 82.59% | ⚠️ 35 Missing |
| `app.ts` | 81.74% | ⚠️ 21 Missing |
| `local-ops.ts` | 97.49% | ⚠️ 8 Missing |
| `help.ts` | 97.39% | ⚠️ 3 Missing |

Coverage diff

@@            Coverage Diff             @@
##          main       #PR       +/-##
==========================================
- Coverage    80.14%    78.89%    -1.25%
==========================================
  Files          120       127        +7
  Lines        16316     18254     +1938
  Branches         0         0         —
==========================================
+ Hits         13075     14400     +1325
- Misses        3241      3854      +613
- Partials         0         0         —

Generated by Codecov Action

Restrict GITHUB_TOKEN to contents:read as flagged by CodeQL. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Update SvelteKit template with working deps (adapter-node, latest svelte/vite) and add required src files (app.d.ts, app.html). Use python3 instead of python for venv creation in Flask/FastAPI platforms. Add --concurrency 6 to init-eval test runner. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add push/pull_request triggers so the eval runs automatically alongside other CI checks. Keep workflow_dispatch for manual single-platform runs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Move hardcoded numeric values, string literals, and exit codes into constants.ts for better readability and maintainability across the init module. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Move regex to top-level constant (useTopLevelRegex)
- Remove unused template literal (noUnusedTemplateLiteral)
- Replace explicit `return undefined` with bare `return` (noUselessUndefined)
- Apply formatter to both source and test files

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add tests for local-ops (FS operations, command execution, patchset application), formatters (result/error display), help (banner/custom help output), interactive prompts (select/multiselect/confirm), and wizard-runner (TTY check, success/error paths, suspend/resume loop). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

test/lib/init/formatters.test.ts

Keep both openai and marked dependencies, add test:init-eval script back, and take main's version (0.14.0-dev.0) and restructured package.json layout. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…into feat/init-command

…coverage

Include test/isolated in the test:unit coverage run so that existing comprehensive tests for wizard-runner and interactive modules count toward patch coverage. Add new tests for init command parsing, clack-utils utilities, cancel paths, and wizard-runner edge cases. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Bun's mock.module() leaks between test files in the same run. Keep test:unit and test:isolated as separate invocations, add coverage flags to test:isolated, and merge lcov reports before upload. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

src/lib/init/local-ops.ts

The init-command test's mock.module() for interactive.js was poisoning init-interactive.test.ts in the same bun test run. Moved to test/commands/ with a single wizard-runner.js mock instead of 7 redundant mocks — no other test in test/commands/ depends on that module. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

src/lib/init/wizard-runner.ts

src/lib/init/local-ops.ts

bun's mock.module() leaks across files when run in a single process. Run each test/isolated/*.test.ts file in its own bun test invocation to ensure true mock isolation, accumulating LCOV coverage for CI. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Running each isolated test in its own bun process produces overlapping coverage for shared source files. Concatenating these LCOV files created duplicate SF entries that codecov counted as separate files, inflating line counts and dropping project coverage from 80% to 51%. Add script/merge-lcov.sh (awk) to deduplicate by source file, taking the max hit count per line, so codecov sees 51 unique files instead of 133 duplicate entries. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
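The deduplication described in this commit is done by an awk script (script/merge-lcov.sh), which is not shown here. The TypeScript sketch below illustrates the same idea under simplifying assumptions: it keeps only `SF` and `DA` entries, merges records that share a source file, and takes the maximum hit count per line. Function and branch entries are dropped, and the function name is hypothetical.

```typescript
// Merge concatenated LCOV text into one record per source file,
// keeping the maximum hit count seen for each line.
// Simplification: only SF and DA entries are read; FN/BRDA entries are ignored.
function mergeLcov(input: string): string {
  const files = new Map<string, Map<number, number>>();
  let current: Map<number, number> | undefined;

  for (const rawLine of input.split("\n")) {
    const line = rawLine.trim();
    if (line.startsWith("SF:")) {
      const file = line.slice(3);
      current = files.get(file) ?? new Map<number, number>();
      files.set(file, current);
    } else if (line.startsWith("DA:") && current) {
      const parts = line.slice(3).split(",");
      const lineNo = Number(parts[0]);
      const hits = Number(parts[1]);
      current.set(lineNo, Math.max(current.get(lineNo) ?? 0, hits));
    } else if (line === "end_of_record") {
      current = undefined;
    }
  }

  const out: string[] = [];
  files.forEach((lines, file) => {
    const sorted = Array.from(lines.entries()).sort((a, b) => a[0] - b[0]);
    out.push(`SF:${file}`);
    sorted.forEach(([lineNo, hits]) => out.push(`DA:${lineNo},${hits}`));
    out.push(`LF:${sorted.length}`); // lines found
    out.push(`LH:${sorted.filter((entry) => entry[1] > 0).length}`); // lines hit
    out.push("end_of_record");
  });
  return out.join("\n") + "\n";
}
```

Taking the maximum rather than the sum avoids inflating hit counts when the same test setup code runs in every isolated process; either choice yields the same covered/uncovered verdict per line.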

src/lib/init/local-ops.ts

Convert init-interactive and init-wizard-runner tests from isolated mock.module() pattern to spyOn() on namespace imports, eliminating mock leakage without process isolation. Also fix CI coverage merge to deduplicate LCOV entries via merge-lcov.sh. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

src/lib/init/local-ops.ts

spyOn on local TS module exports doesn't intercept in bun on Linux, so wizard-runner must stay isolated with mock.module(). The interactive test remains in test/lib/init/ since it only spies on @clack/prompts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Wrap fd operations in try/finally so fs.closeSync is always called, even if fs.readSync throws an I/O error. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
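The pattern this commit describes can be sketched as below. The helper name `readHead` and its `maxBytes` parameter are hypothetical; the point is only that `fs.closeSync` sits in a `finally` block so the descriptor is released whether or not `fs.readSync` throws.

```typescript
import * as fs from "node:fs";

// Read up to maxBytes from the start of a file, always releasing the descriptor.
function readHead(filePath: string, maxBytes: number): string {
  const fd = fs.openSync(filePath, "r");
  try {
    const buffer = Buffer.alloc(maxBytes);
    const bytesRead = fs.readSync(fd, buffer, 0, maxBytes, 0);
    return buffer.subarray(0, bytesRead).toString("utf8");
  } finally {
    // Runs on success and when readSync throws, so the fd never leaks.
    fs.closeSync(fd);
  }
}
```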

src/lib/init/local-ops.ts

…rm shell

Add >, <, and & to SHELL_METACHARACTER_PATTERNS to prevent redirection (e.g. `npm install foo > /arbitrary/path`) and background execution (e.g. `npm install foo & curl evil.com`) in validated commands. Replace hardcoded `spawn("sh", ...)` with `spawn(command, [], { shell: true })` so Node selects the platform-appropriate shell (cmd.exe on Windows). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The remote workflow controls payload.cwd, but handleLocalOp never checked that it falls within options.directory. A misbehaving workflow could set cwd:"/" to escape the path sandbox entirely. Now cwd is validated at the top of handleLocalOp before any operation runs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
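One way to express the containment check described in this commit is sketched below. The function name is hypothetical and this is not the code from local-ops.ts; it also does not resolve symlinks, which a stricter check might do with `fs.realpathSync`.

```typescript
import * as path from "node:path";

// True when candidate resolves to baseDir itself or to a path beneath it.
function isInsideDirectory(baseDir: string, candidate: string): boolean {
  const base = path.resolve(baseDir);
  const target = path.resolve(base, candidate);
  const relative = path.relative(base, target);
  const escapes =
    relative === ".." ||
    relative.startsWith(`..${path.sep}`) ||
    path.isAbsolute(relative); // e.g. a different drive on Windows
  return !escapes;
}
```

Comparing via `path.relative` instead of a string prefix test avoids treating a sibling such as `/project-evil` as being inside `/project`.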

runWizard swallows errors and returns void, but most error paths were missing process.exitCode = 1, causing failed initializations to exit with code 0 in CI/CD pipelines. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
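A small sketch of one way to keep failure paths from silently exiting with 0: route every early return through a single helper that sets `process.exitCode`. The helper and function names are hypothetical and are not taken from wizard-runner.ts.

```typescript
// Hypothetical helper: report a failure and mark the process as failed.
// Setting process.exitCode (rather than calling process.exit) lets pending
// output flush before Node exits with the non-zero status.
function failWizard(message: string, report: (msg: string) => void): void {
  report(message);
  process.exitCode = 1;
}

// Every early-return error path goes through the helper, so none can forget
// the exit code. `payload` stands in for any value that may be absent.
function handleStep(payload: unknown, report: (msg: string) => void): boolean {
  if (payload === undefined) {
    failWizard("No suspend payload found", report);
    return false;
  }
  return true;
}
```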

src/lib/init/wizard-runner.ts

src/lib/init/local-ops.ts

- Move createRun() into try/catch so network failures get graceful "Connection failed" message instead of an unhandled stack trace
- Block shell expansion characters ($, ', ", \) in validateCommand to prevent bypass via ANSI-C quoting, variable expansion, and escapes
- Remove unused stdout/stderr/stdin fields from WizardOptions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

sentry · 2026-03-02T21:19:55Z

src/lib/init/wizard-runner.ts

+      const payload = extractSuspendPayload(result, stepId);
+      if (!payload) {
+        spin.stop("Error", 1);
+        log.error(`No suspend payload found for step "${stepId}"`);
+        cancel("Setup failed");
+        return;
+      }


Bug: A specific error path in the setup wizard exits with code 0 (success) instead of 1 (failure) when a suspend payload is missing, causing silent failures.
Severity: MEDIUM

Suggested Fix

Before the return; statement on line 182 in the if (!payload) block, add process.exitCode = 1; to ensure the process exits with a failure code, consistent with other error-handling paths in the function.

Prompt for AI Agent

Review the code at the location below. A potential bug has been identified by an AI agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not valid. Location: src/lib/init/wizard-runner.ts#L177-L183 Potential issue: In the `runWizard` function, if `extractSuspendPayload` returns an undefined payload, the function logs an error and then executes an early `return` on line 182. This specific error path fails to set `process.exitCode = 1` before exiting. As a result, a failed setup wizard run will incorrectly report a success status (exit code 0) to the calling process. This behavior is inconsistent with all other failure paths within the same function, which correctly set the exit code to 1, and can cause CI/CD pipelines or automation scripts to misinterpret a failure as a success.

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

cursor · 2026-03-02T21:23:42Z

src/lib/init/local-ops.ts

+    { pattern: ">", label: "redirection (>)" },
+    { pattern: "<", label: "redirection (<)" },
+    { pattern: "&", label: "background execution (&)" },
+  ];


Parentheses bypass shell metacharacter and executable blocklist

High Severity

SHELL_METACHARACTER_PATTERNS blocks $( but not standalone ( or ). Since runSingleCommand uses shell: true, a command like (rm -rf .) passes all validation: no blocked metacharacters match, and the first-token extraction yields (rm whose path.basename is "(rm" — not in BLOCKED_EXECUTABLES. The shell then interprets (...) as a subshell, executing the blocked executable. This defeats both the metacharacter check and the executable blocklist for any command wrapped in parentheses.

Additional Locations (1)

src/lib/init/local-ops.ts#L113-L133
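To make the reported bypass concrete, here is a simplified validator that also rejects standalone parentheses. The constant names mirror those mentioned in the review, but their contents here are illustrative assumptions (the PR's real pattern list and blocklist are not shown in full), and the function body is a sketch rather than the code in local-ops.ts.

```typescript
// ASSUMPTION: illustrative entries only; the real lists are longer.
const SHELL_METACHARACTER_PATTERNS = [
  { pattern: ";", label: "command chaining (;)" },
  { pattern: "|", label: "pipe (|)" },
  { pattern: ">", label: "redirection (>)" },
  { pattern: "<", label: "redirection (<)" },
  { pattern: "&", label: "background execution (&)" },
  { pattern: "$", label: "expansion ($)" },
  { pattern: "(", label: "subshell (()" },
  { pattern: ")", label: "subshell ())" },
];

// ASSUMPTION: hypothetical blocklist contents.
const BLOCKED_EXECUTABLES = new Set(["rm", "sudo"]);

// Returns a rejection reason, or undefined when the command looks acceptable.
function validateCommand(command: string): string | undefined {
  for (const { pattern, label } of SHELL_METACHARACTER_PATTERNS) {
    if (command.includes(pattern)) {
      return `Blocked: ${label}`;
    }
  }
  // First whitespace-delimited token, reduced to its basename.
  const firstToken = command.trim().split(/\s+/)[0] ?? "";
  const executable = firstToken.split("/").pop() ?? "";
  if (BLOCKED_EXECUTABLES.has(executable)) {
    return `Blocked executable: ${executable}`;
  }
  return undefined;
}
```

Without the two parenthesis entries, `(rm -rf .)` would yield the first token `(rm`, which matches nothing in the blocklist, and the shell would then run the contents as a subshell. Rejecting the characters up front closes that gap; a stricter alternative is an allowlist of permitted executables.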

betegon and others added 21 commits February 17, 2026 20:48

feat(init): add init command for guided Sentry project setup

4d6ddde

Adds `sentry init` wizard that walks users through project setup via the Mastra API, handling DSN configuration, SDK installation prompts, and local file operations. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat(init): pass tracing options to Mastra workflow runs

8146d8b

Sends tags and metadata (CLI version, OS, arch, node version) with startAsync and resumeAsync calls so workflow runs are visible and filterable in Mastra Studio. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat(init): generate unique trace ID for each wizard run

57f902f

Import randomBytes and generate a hex trace ID so all suspend/resume calls within a single wizard run share one trace. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix(init): flatten nested workflow spans with shared parent span ID

0c5e440

Add a synthetic parentSpanId to tracingOptions so all workflow run spans become siblings under the same parent instead of nesting by timestamp containment. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix(init): remove unnecessary parentSpanId from tracing options

d60e3b2

The parentSpanId was creating artificial nesting - let the workflow engine handle span hierarchy naturally. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix: added auth headers in the mastra client (#264)

1e76a55

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

fix(init): update MASTRA_API_URL to production worker endpoint

077119a

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat(init): add flask and python profiling doc URLs

bcb10f2

Add Sentry doc URLs for python-flask (getting-started, errors, tracing, logs, profiling) and add the shared python/profiling page to both flask and fastapi profiling entries. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat(init): add nextjs doc URLs for eval ground truth

129e7b7

Add Sentry doc URLs for all nextjs features: getting-started, errors, logs, tracing, session replay, metrics, and profiling (browser + node). Sourcemaps left empty for now. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat(init): add sveltekit doc URLs for eval ground truth

b6c10b7

Add Sentry doc URLs for sveltekit features and add missing logs, metrics, and profiling features to the platform entry. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat(init): add react-vite doc URLs for eval ground truth

a8156c0

Add Sentry doc URLs for react-vite features and add missing logs, metrics, and profiling features to the platform entry. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat(init): add python-fastapi eval test and gitignore package-lock

ce9614f

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

style(init): fix lint formatting in eval test files

3227619

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-advanced-security bot found potential problems Feb 23, 2026

.github/workflows/init-eval.yml (fixed)

betegon and others added 3 commits February 23, 2026 22:16

ci(init): add minimal permissions to init-eval workflow

04ae63d

Restrict GITHUB_TOKEN to contents:read as flagged by CodeQL. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

ci(init): run init-eval on PRs and main pushes

102baa6

Add push/pull_request triggers so the eval runs automatically alongside other CI checks. Keep workflow_dispatch for manual single-platform runs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

betegon temporarily deployed to init-eval February 24, 2026 10:42 — with GitHub Actions Inactive

betegon and others added 4 commits February 26, 2026 10:53

refactor(init): extract magic values into named constants

b8cddd5

Move hardcoded numeric values, string literals, and exit codes into constants.ts for better readability and maintainability across the init module. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Merge remote-tracking branch 'origin/main' into claude/funny-jang

b20089e

github-advanced-security bot found potential problems Feb 27, 2026

test/lib/init/formatters.test.ts (dismissed)

betegon and others added 4 commits March 2, 2026 16:55

merge: resolve conflicts with main branch

9fb739d

Keep both openai and marked dependencies, add test:init-eval script back, and take main's version (0.14.0-dev.0) and restructured package.json layout. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Merge branch 'feat/init-command' of https://github.com/getsentry/cli …

20a4431

…into feat/init-command

betegon marked this pull request as ready for review March 2, 2026 18:07

sentry bot reviewed Mar 2, 2026

src/lib/init/local-ops.ts (resolved)

cursor bot reviewed Mar 2, 2026

src/lib/init/local-ops.ts (resolved)

sentry bot reviewed Mar 2, 2026

src/lib/init/wizard-runner.ts (resolved)

cursor bot reviewed Mar 2, 2026

src/lib/init/local-ops.ts (resolved)

betegon and others added 2 commits March 2, 2026 19:35

sentry bot reviewed Mar 2, 2026

src/lib/init/local-ops.ts (outdated, resolved)

cursor bot reviewed Mar 2, 2026

src/lib/init/local-ops.ts (outdated, resolved)

betegon and others added 2 commits March 2, 2026 20:33

fix: close file descriptor on readSync failure in readFiles

241eb0c

Wrap fd operations in try/finally so fs.closeSync is always called, even if fs.readSync throws an I/O error. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

sentry bot reviewed Mar 2, 2026

src/lib/init/local-ops.ts (outdated, resolved)

betegon and others added 3 commits March 2, 2026 20:51

fix: set process.exitCode on wizard failure paths

6ca341e

runWizard swallows errors and returns void, but most error paths were missing process.exitCode = 1, causing failed initializations to exit with code 0 in CI/CD pipelines. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

cursor bot reviewed Mar 2, 2026

src/lib/init/wizard-runner.ts (outdated, resolved)

src/lib/init/wizard-runner.ts (resolved)

src/lib/init/local-ops.ts (resolved)

sentry bot reviewed Mar 2, 2026

cursor bot reviewed Mar 2, 2026
